ED 2 W 1103 

DOCUMENT RESUME

TH 810 tt17

AUTHOR        Upp, Caroline H.; Barcikowski, Robert S.
TITLE         The Evaluation of a Model for the Assessment of Class
              Progress.
INSTITUTION   Ohio Univ., Athens.
SPONS AGENCY  Ohio State Dept. of Education, Columbus.
PUB DATE      Apr 81
NOTE          51p.; Paper presented at the Annual Meeting of the
              American Educational Research Association (65th, Los
              Angeles, CA, April 13-17, 1981).

EDRS PRICE    MF01 Plus Postage. PC Not Available from EDRS.
DESCRIPTORS   *Academic Achievement; Attitude Measures; Cognitive
              Measurement; *Evaluation Methods; Formative
              Evaluation; Higher Education; Instructional
              Improvement; *Item Sampling; *Models; *Program
              Evaluation



ABSTRACT

Demands for more complete information on educational
programs have emanated from national, state and local sources. Their
focus is on the processes that are occurring in individual
classrooms. The information that is collected to provide insight into
educational programs is customarily summative in nature, answering,
for example, questions regarding student progress toward
accomplishment of objectives. Here, a model is evaluated which allows
that decisions be made for the benefit of the students still enrolled
in classes, not merely the students of future classes. This model,
the B-model, is concerned with cognitive and affective outcomes.
Results show that this model provides researchers and school
administrators with a sensitive measurement approach, which is
economical in terms of teacher/student time, with which to measure
student progress and, perhaps, teacher effectiveness throughout the
school year. (Author/GK)



Reproductions supplied by EDRS are the best that can be made
from the original document.






THE EVALUATION OF A MODEL FOR THE
ASSESSMENT OF CLASS PROGRESS



Caroline Upp and Robert S. Barcikowski 
Ohio University 





Paper presented at the annual meeting of the 
American Educational Research Association, 
Los Angeles, April, 1981



The preparation of this paper was partially supported through an
Ohio State Department of Education Project 419 Year II Research
and Development Grant. The opinions expressed herein do not
necessarily reflect the position or policy of the Ohio State
Department of Education, and no official endorsement should be
inferred.



ABSTRACT

Demands for more complete information on educational programs
have emanated from national, state and local sources. Their focus,
in the final analysis, is on the processes that are occurring in
individual classrooms. The information that is collected to provide
insight into educational programs is customarily summative in nature,
answering such questions as "What is the average reading level of
fourth grade students in May?" Here, a model is evaluated that
requires measurements throughout a school term so that decisions can
be made that will benefit the students still enrolled in classes,
not merely the students of future classes.



The Evaluation of a Model for the Assessment
of Class Progress

Introduction

Evaluation in the field of education has been defined by Cronbach
as "the collection and use of information to make decisions about
educational programs" (Cronbach, 1963). Such evaluation has been an aim
of educators and laymen alike. At local, state, and national levels,
demands for information about the effectiveness of educational programs
are heard. Evidence of these demands can be found in Michigan's
state-mandated accountability program (Porter, 1976), the ninth annual
Gallup Poll of public attitudes toward education (Gallup, 1977), and
concern over declining test scores (Ebel, 1976). Legislative bodies
in state houses and the Congress have authorized funds to be used
expressly for evaluating educational programs (Worthen & Sanders,
1973).

These demands for more complete information on educational
programs have emanated from national, state, and local sources. Their
focus, in the final analysis, is on the processes that are occurring
in individual classrooms. The information that is collected to
provide insight into educational programs is customarily summative in
nature, answering such questions as "What is the average reading
level of fourth grade students in May?" or "How well did students
taking this year's ACT tests perform as compared with those students
who were tested in the same manner five or ten years ago?"
These questions are important ones, to be sure. However, there



are other questions that are of at least equal importance to the
individual classroom [...] the local school [...]. These
are such questions as the following:

1. How is this particular group of students progressing
   toward accomplishment of the objectives for the term?

2. As the school term proceeds and [...] introduced,
   are the students retaining their learning [...]
   assimilated [...] weeks?

3. Has a point been reached at which the curve of
   learning is flattening or descending?

4. What are the attitudes of the students toward the subject
   at hand?

5. Are these attitudes changing? If so, are they becoming
   more positive or more negative?

6. Is this group of students learning at a rate comparable
   to that of similar groups?
These questions cannot be answered by summative measurements
taken only at the end of the term, but rather must be answered by
means of frequent testing throughout the school term. If such
measurements are taken throughout the term, decisions can be made
that will benefit the students still enrolled in the class, not
merely the students who will be enrolled in future terms.

The principal barriers preventing the collection of frequent
measurements have been concerned with the omnipresent factors of
time and money. Frequent testing complete enough to provide accurate
information on students' cumulative progress toward yearly goals has
been extremely costly in terms of teacher and pupil time and testing



expense. Barcikowski and Upp (1978), however, have suggested an
approach, referred to here as the B-model, based on multiple matrix
sampling which may enable frequent, accurate collection of such data
at a fraction of the customary cost. The utilization of a multiple
matrix sampling process requires that only a few test items need be
administered to each student. Because the questions to be answered
by the testing program refer to group progress and not to individual
achievement, accurate estimates can be derived from this small number
of items administered to each student. For example, instead of a test
of 150 items for each student, a test of twelve to fifteen items per
student may be sufficient to provide a reasonably accurate measurement
of the achievement of the class. Computer compilation, printing,
and scoring of the tests provide accurate data on class status with
minimal input of teacher time. The B-model is designed to measure
progress toward both cognitive and affective objectives for the term
in this fashion.
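
The claim that a dozen or so items per student can stand in for a full-length test can be illustrated with a small simulation. The sketch below is not the authors' program. It assumes a hypothetical class of 30 students and a 150-item domain, draws 15 items independently at random for each student, and pools all responses into one class estimate. The function names, the class size, and the 0.6 chance of a correct answer are illustrative assumptions only.

```python
import random


def matrix_sample_estimate(responses, items_per_student, seed=0):
    """Estimate the class proportion correct from a small item sample.

    responses[s][i] is 1 if student s would answer item i correctly,
    else 0. Each student is "given" only items_per_student items,
    drawn at random without replacement (an assumed sampling plan).
    """
    rng = random.Random(seed)
    n_items = len(responses[0])
    observed = []
    for student in responses:
        chosen = rng.sample(range(n_items), items_per_student)
        observed.extend(student[i] for i in chosen)
    return sum(observed) / len(observed)


def full_test_mean(responses):
    """Proportion correct if every student answered every item."""
    total = sum(sum(student) for student in responses)
    return total / (len(responses) * len(responses[0]))


# Hypothetical data: 30 students, 150 items, about 60% correct.
data_rng = random.Random(42)
responses = [
    [1 if data_rng.random() < 0.6 else 0 for _ in range(150)]
    for _ in range(30)
]

estimate = matrix_sample_estimate(responses, items_per_student=15)
truth = full_test_mean(responses)
print(f"sampled estimate: {estimate:.3f}   full-test value: {truth:.3f}")
```

Under these assumptions each student answers one tenth of the items, yet the pooled estimate rests on 450 responses, which is why the class-level figure can remain reasonably close to the full-test value even though no individual score is interpretable.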

Class Progress: What Should Be Measured?

Before a measurement system can be initiated for evaluating
class progress, some decisions must be made regarding the aspects to
be evaluated. The outcomes of education are multidimensional, as is
the process of education itself. Some of these dimensions are
cognitive; some are affective; others relate to moral character,
adjustment to life, self-confidence, and citizenship. In evaluating
the performance of a class of pupils, any of these dimensions may be
assessed. The general public demands assessment of cognitive
outcomes (Porter, 1976; Gallup, 1977; Ebel, 1973); and it does appear
that any evaluation of class progress must give attention to
cognitive skills.

One of the goals of teaching, however, is to encourage students
to go beyond the subject matter being covered in class and to encourage
deeper and wider pursuit of the subject. A study by French (1961)
indicates that favorable student attitudes do lead to increased time spent
at that activity and choice of further scholastic pursuits related to
that field. It therefore is desirable to have favorable student
attitudes toward the subject matter being studied. While attitudes toward
the subject may have been formed many years previously, Ausubel (1968),
Biehler (1971), and Blair, Jones, and Simpson (1967) agree that
attitudes are not immutable and can be changed by skillful handling.
Simonson (1976) reports a study in which attitudes were deliberately
changed through use of the dissonance theory.

These studies indicate two things: that positive attitudes
toward subject matter are desirable, and that attitudes can be changed
in the desired direction by use of certain known techniques. If
this is true, measurement of attitude, and especially measurement of
attitude change during the term, should be of value to the teacher.
Johnson (1974) reminds us of the following:

    All learning has affective components. No matter what
    knowledge or skills a student masters, he will have feelings
    about the process and results of instruction. In mastering
    the skills of reading or in learning about history, a student
    develops feelings about reading and about history, as well
    as about learning and instruction, that will influence his
    behavior in the future. Because students' affective responses
    to school experiences influence future behavior, the development
    of positive affective reactions may be more important than the
    mastery of specific knowledge and skills. It does little
    good to teach a student to read if he ends up disliking reading
    and avoids it whenever possible. (Johnson in Walberg, 1974,
    p. 99)

Some authors (e.g., Ebel, 1972) would limit assessment to
cognitive outcomes alone. Others (Johnson, 1974; Krathwohl, Bloom, &
Masia, 1964) believe that affective components should be included in
the evaluation model. Still other writers might wish to include
assessment of some of the other outcomes enumerated above. It is
apparent, however, that no practicable evaluation plan can include
all of these components.

Model and Purpose

Model. The B-model is concerned with cognitive and affective
outcomes. This is not intended to negate the importance of the other
outcomes. It recognizes the fact that pupil gain scores in such
nebulous areas as citizenship and moral character are extremely
difficult to measure and are affected by many factors other than the
instructional program. The model presented here is a systematic
approach to monitoring class progress in the cognitive and affective
areas periodically throughout the school term.

Other models for evaluating class progress measure accomplishment
only at terminal points, e.g., the end of a chapter, the end of a
unit, the end of a semester or school year. Because this model takes
regular, frequent measurements of progress towards cognitive and
affective outcomes, several desirable results may be achieved that
are lacking with traditional models. First, the instructor will know
at frequent, periodic intervals how his class is progressing toward
accomplishment of his objectives for the term. Second, the
instructor can accurately assess the amount of progress from one
period to the next. Third, if the slope of the learning curve
appears to flatten or to descend, the instructor can take remedial
action at once. Fourth, the students themselves can observe the
progress of their class as a whole. Fifth, if the model is used in
the same kinds of classes with the same type of students, typical
learning curves will become apparent. These can serve as standards
of comparison for teachers and their supervisors and aid in the
identification of effective teaching. The model will therefore
serve three purposes:

1. to monitor class progress,

2. to motivate students,

3. to serve as a tool to assist in assessment of teaching.

• ^ ' ' ' . .. ■■ : 

Purpose. The present study provided for implementation of this
model. The purpose of this study was to gain valuable information
regarding the feasibility and practicability of the B-model for
classroom use. This study was visualized as the first in a series
of trials of the model in various settings to determine its potential
utility for measuring gains in achievement and changes in attitudes
of students over a period of time.

The B-Model

' — . *> ■ ■ • 

The B-model measures change in the level of pupil achievement
by the use of multiple matrix sampling (for more information on this
method see Appendix A, Multiple Matrix Sampling). The model extends
beyond other designs for measuring class progress in two important
respects. The B-model includes measurement of attitudinal as well
as cognitive changes and includes more than just the pre-test and
post-test measures recommended by Shoemaker (197^). It calls for
testing at eight to ten periodic intervals throughout the school
term, thus providing the teacher with valuable information to
guide the instructional program.

The unique nature of the model, with its multiple measurements
taken during the term, may be illustrated by contrasting it with a
typical example of program evaluation made in the traditional way.
A study reported by Leinhardt (1977) designed to evaluate a program
of study included data from four different sources: standardized
tests, questionnaires, videotapes, and student records. These
measurements, however, were taken only in the fall and in the spring.
This pattern of fall-spring measurement has been typical of previous
attempts to monitor class performance or to evaluate program success.
The B-model to monitor class progress, with respect to student
knowledge and attitude, would consist of observations made at equal
intervals throughout the school term, using multiple matrix sampling
techniques as described in Sirotnik (1974). Although there is
no general agreement on what constitutes a learning curve for a given
group (Hilgard & Bower, 1966), most educators would agree that given
a set, or item domain, of test items designed to measure what a teacher
is teaching, the percentage of items answered correctly by the teacher's
students should increase over time. The amount of increase would be
dependent on a number of factors, including intelligence, motivation,
social conscience level, etc., of the students, and the effectiveness
of the teacher.






The components of the model are the following:

1. a set of objectives for the particular subject matter area,

2. an item domain of test items which will measure these
   objectives,

3. a computer system which will randomly select items from the
   item domain to be used for the periodic tests,

4. a system that is both efficient and effective for producing,
   administering, and scoring the tests, and

5. the return of information on class progress to the instructor
   and to the students.

Shoemaker (1975) indicated the desirability of achievement
tests derived from instructional programs by means of the item
universe concept.

    An instructional program and its associated item universe
    are isomorphic. For every instructional program there exists
    one and only one item universe that is inseparable conceptually
    from it. The item universe is an operable definition of the
    instructional program. (pp. 128-129)

Shoemaker further asserts that the instructional program is the
vehicle for providing the necessary skills to answer correctly all
items in the item universe, not by teaching the correct response to
each item, but by teaching algorithms or concepts needed by students
to respond appropriately. Most item universes, however, are so
large as to be unmanageable and are therefore impossible to work
with. A workable item universe can be found in the form of the "item
domain". An item domain is a definable and enumerable subuniverse of
items selected from the item universe in such a way that it includes
every area in the item universe. Thus achievement, as measured by the
item domain, will be equivalent to achievement measured by the item
universe. An item domain for a given area might realistically include
500 to 2000 items.

Assessment of group progress toward accomplishment of the course
objectives can be made by use of multiple matrix sampling in which the
item domain is divided into small subtests and each subtest administered
to a group of students sampled randomly from those participating
in the instructional program. Because each student responds to only
a small portion of the total number of items available, the testing
program need not utilize a large amount of class time.
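
The division of an item domain into subtests, and the random assignment of students to them, can be sketched as follows. This is an illustrative outline only, not the program used in the study: the function names, the round-robin dealing of shuffled items, and the example sizes (a 500-item domain, 25 subtests, 50 students) are assumptions chosen for the example.

```python
import random


def build_subtests(item_ids, n_subtests, seed=0):
    """Shuffle the item domain and deal it into disjoint subtests.

    Every item lands in exactly one subtest, so the subtests taken
    together cover the whole domain.
    """
    rng = random.Random(seed)
    shuffled = list(item_ids)
    rng.shuffle(shuffled)
    # Deal items round-robin so subtest sizes differ by at most one.
    return [shuffled[k::n_subtests] for k in range(n_subtests)]


def assign_subtests(student_ids, subtests, seed=0):
    """Randomly assign each student to one subtest.

    Students are shuffled, then matched to subtests in rotation, so
    each subtest is taken by a random subgroup of similar size.
    """
    rng = random.Random(seed)
    order = list(student_ids)
    rng.shuffle(order)
    return {
        student: subtests[position % len(subtests)]
        for position, student in enumerate(order)
    }


# Hypothetical example: 500 items, 25 subtests, 50 students.
subtests = build_subtests(range(500), n_subtests=25)
students = [f"student_{n}" for n in range(50)]
assignment = assign_subtests(students, subtests)
print(len(subtests), "subtests of", len(subtests[0]), "items each")
```

With these numbers each student sees 20 of the 500 items and each subtest is taken by two students. Repeating the draw with a different seed at each testing occasion would give a fresh sample of items for every periodic test.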

The B-model does not assume that classes should progress at equal
rates. After each testing period, the means of cognitive items for
each class are to be plotted as a curve of accumulated learning. From
previous studies of such curves (Hilgard, 1956) it is assumed that
the learning curves will show patterns similar to those in Figure 1.





[Figure 1. Examples of Expected Class Learning Curves. Horizontal axis: Weeks in School.]



It will be observed that not all classes begin or end at the same
place on the scale, nor do all show the same rate of improvement.
This reflects the different abilities of the various groups and is
to be expected. Each school system would have to develop its own set
of learning curves to discover what kinds of patterns should be
expected in various teaching positions.
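
One way an instructor might read such a curve is to look at the gain from each testing period to the next and note where it stops being positive. The sketch below assumes a list of weekly class means (proportion of items correct) and a hypothetical threshold, `min_gain`. Neither the function names nor the sample means come from the study.

```python
def weekly_gains(means):
    """Change in the class mean from each testing period to the next."""
    return [later - earlier for earlier, later in zip(means, means[1:])]


def flag_flat_or_falling(means, min_gain=0.0):
    """Return the (1-based) weeks whose gain over the previous week
    is at or below min_gain, i.e. where the curve flattens or descends.
    """
    return [
        week
        for week, gain in enumerate(weekly_gains(means), start=2)
        if gain <= min_gain
    ]


# Hypothetical class means for six weekly tests.
class_means = [0.30, 0.38, 0.45, 0.45, 0.43, 0.50]
flagged_weeks = flag_flat_or_falling(class_means)
print("weeks needing attention:", flagged_weeks)
```

In this made-up series the curve is flat at week 4 and falls at week 5, so those two weeks are flagged. A school system that had built up its own typical curves could substitute a threshold drawn from them in place of zero.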

The same considerations described above for cognitive items must
be utilized in dealing with the affective domain. Affective objectives
must be outlined clearly; items must be written to test students'
attitudes; periodic multiple matrix sampling can be used to evaluate
the students' position with regard to the objectives. In this way the
B-model measures both cognitive and affective aspects of class
performance. The goal in teaching is to increase the cognitive level of
the students with regard to the subject matter and to improve, or at
least to maintain, affective dispositions.

For implementation of the B-model, the starting
point is the writing of a set of objectives that describe in detail
all of the teacher's attitudinal and cognitive goals for the course.
If several different teachers teach the same course, this should
either be a cooperative project or the project should be assigned to
one or a few teachers and carefully reviewed by all teachers who will
be involved with the course. A process of revision of objectives
should continue until all teachers can agree on the following:

1. The objectives listed are reasonable and desirable ones
   for the course in question.

2. The objectives listed are (for the vast majority) the topics
   I intend to cover in the course. (The teacher may have some
   additional objectives not included on the list.)
A set of test items must be compiled that test each objective
listed. These items are either composed after the
objectives are written or collected from a pool of test items that
may already be in existence for the course. There must be at least
one test item for each objective, and no test item should refer to
a topic not covered in the list of objectives. Ideally, several
test items would be written for each objective.

Once the sizes of the item domains for both attitudinal and
cognitive objectives have been determined, a computer program can be
written (e.g., Barcikowski & Patterson, 1972) which is designed to
select items at random from each domain and print tests. Each item,
cognitive and attitudinal, is numbered and either typed on computer
cards or stored on computer tape. The number of items to be chosen
for each subtest from each domain will be determined by the size of
the class, the sizes of the item domains, and the time available for
testing. The number of subtests of each type to be printed will
depend on the number of students placed in each subgroup.
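
When the whole item domain is spread across the class at every testing, as this study reports doing, the subtest length follows from simple division. The sketch below works through that arithmetic for the figures given in the Results (238 achievement items, 58 attitude items, 15 students in Class 1). The function name is hypothetical, and giving each student a separate subtest is an assumption made for the example.

```python
def subtest_sizes(domain_size, class_size):
    """Split domain_size items as evenly as possible over class_size
    students; returns one subtest length per student."""
    base, extra = divmod(domain_size, class_size)
    # 'extra' students receive one item more than the rest.
    return [base + 1] * extra + [base] * (class_size - extra)


achievement = subtest_sizes(238, 15)
attitude = subtest_sizes(58, 15)
print("achievement items per student:", sorted(set(achievement)))
print("attitude items per student:", sorted(set(attitude)))
```

For a class of 15 this gives achievement subtests of 15 or 16 items and attitude subtests of 3 or 4 items, the same sizes the paper reports for Class 1. Smaller classes would need proportionally longer subtests, which is where the time available for testing becomes the limiting factor.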

The principal advantages in theory of the B-model for measuring
progress of groups of students are the following:

1. Information that shows progress of students toward
   accomplishment of the objectives for the term is given
   to students and teachers on a regular, frequent basis.

2. Students and teachers can watch the curve of accumulated
   learning ascend as the students' knowledge increases.
   This should have a motivating effect on students and
   teachers alike.

3. Teachers can be alerted to attitude changes of the students.

4. Teachers can compare learning curves from one group to
   another. They can then investigate the possible reasons
   for differences in learning curves from one term to the
   next.

5. Supervisors and administrators can use the learning curves
   to identify consistently outstanding teachers with the
   idea of attempting to determine possible causes for their
   consistent superiority.

The ultimate use of the B-model is to provide information
available in no other convenient way that can be used for improvement
of instruction. It is a systematic approach to measuring group
progress.

It should be noted, however, that the B-model is not considered
suitable for all types of classes. While all classes have cognitive
and affective objectives, not all of these lend themselves to
measurement by short-answer objective tests. Some types of objectives
require too much time for measurement in a multiple matrix sampling
plan. Subjects such as history, mathematics, science, and certain
courses in English are the kinds considered suitable for measurement
with the B-model.

Problem and Methodology

The problem to be answered in the study is the following:
Is the B-model practical and feasible for measuring the
achievement and attitudes of students over time?

Classes in which the model was tested were five sections of
EDRE 501, Introduction to Research Methods, offered at Ohio
University during the fall quarter, 1978, taught by three different
instructors. Two of the classes were offered on the main campus, one
in the evening and one in the morning; the other three classes were
offered on three branch campuses in the evening. All classes were
offered once a week and met for three hours.

I . ■ ■ 

The study was originally designed to have five different
instructors; however, two instructors left the University for reasons
unrelated to the study. The remaining three instructors were assigned
so that one instructor taught three classes (one branch and the two
classes on campus) and the other two instructors each taught one class
off campus. One instructor was a male full professor who had taught
this course, or a similar course, at least once each year for the past
twenty years. Another instructor was a male assistant professor who
had taught this course several times over the past three years. The
third instructor was a female who had had ten years' experience
teaching at the high school level, and who had finished all of her
course work towards her doctorate in Educational Administration. The
latter instructor had taken this course as a student, but had never
taught it. All of the instructors knew that they would not be
identified in the report of the study.

The following plan of procedures was followed:

1. A list of objectives for the course, in both the cognitive
   and affective domains, was prepared.

2. An item domain that was congruent to the list of objectives
   was assembled.

3. An instrument to measure attitudinal objectives was
   compiled and piloted.

4. The number of items from each item domain that needed to
   be sampled for each subtest was determined using procedures
   described in Sirotnik (1974).

5. A computer program was written that selected and printed the
   requisite number of copies of each subtest.

6. A signed consent to participate in this study was obtained
   from each student in all five classes.

7. Demographic data from students in each EDRE 501 class were
   collected using the form shown in Appendix B.

8. Tests were administered at weekly intervals to the students
   in the EDRE 501 classes.

9. These tests were scored and class means were determined
   each week.

10. Close account was kept of all time spent writing objectives,
    writing test items, administering and scoring tests.

11. The mean for each class was plotted on a separate graph
    and copies were distributed to the instructors. Each
    subsequent mean was plotted on the same graph to indicate class
    progress.

Criteria for Success

The factors that distinguish the B-model from the other models
are the simultaneous use of pupil-gain measures and attitude change
measures, economy of teacher and student time, and provision of
helpful information throughout the school term. It was decided
beforehand that the model would be judged successful if the following
criteria were met:

1. The multiple matrix sampling technique must be able to
   measure changes in student achievement and attitude. This
   would be shown by the curves on the graphs of achievement
   and attitude for each class. The differences in means between
   times would be tested for significance at the .05 level using
   a multivariate repeated measures design.

2. The expenditure of classroom time for testing must not be
   judged too high by the instructors. The exact amount of
   time involved for giving instructions and for administering
   test items would be recorded. The determination of whether
   or not this time is excessive will be a subjective judgment
   based on the instructors' opinions.

3. The instructors must find that the information on achievement
   and attitude contributed to their understanding of the
   progress of their classes. The determination of the worth of
   this information would be a subjective judgment by the
   instructors. Each instructor would be asked to respond to
   questions about this using a structured interview (Appendix [...]).

Results

Construction of the Cognitive and
Attitude Domains

A list of 165 cognitive and 14 attitudinal objectives was
compiled and agreed upon by the instructors. An item domain was then
established which was congruent with the objectives, and which
consisted of 238 items measuring student achievement and 58 items
measuring student attitude. Of the 238 achievement items, 124 were judged
by the instructors to be knowledge items, 75 were judged to be
understanding items, and 38 were judged to be application items.
All of these achievement items were taken from past tests for this
course. The Kuder-Richardson 20 reliabilities of these past tests
ranged from .63 to .88, on tests composed of from 25 to 50 items.
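
For readers unfamiliar with the statistic, the Kuder-Richardson 20 coefficient for a test of k right/wrong items is k/(k-1) times one minus the ratio of the summed item variances (p times q for each item) to the variance of total scores. The sketch below computes it from a small table of 0/1 scores. The data are made up for the illustration, the function name is hypothetical, and the population form of the variance is an assumption, since the paper does not say which form was used.

```python
def kr20(scores):
    """Kuder-Richardson 20 reliability for dichotomous items.

    scores is a list with one entry per student; each entry is a
    list of 0/1 item scores of equal length k (k must be >= 2).
    """
    n_students = len(scores)
    k = len(scores[0])
    # Proportion answering each item correctly (p); q is 1 - p.
    p = [sum(row[i] for row in scores) / n_students for i in range(k)]
    sum_pq = sum(pi * (1 - pi) for pi in p)
    totals = [sum(row) for row in scores]
    mean_total = sum(totals) / n_students
    # Population variance of total scores (an assumed convention).
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_students
    if var_total == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum_pq / var_total)


# Made-up data: four students, four items, perfectly consistent.
consistent = [[1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0]]
reliability = kr20(consistent)
print(f"KR-20 = {reliability:.2f}")
```

The toy data are deliberately extreme: every student either passes or fails all items, so the coefficient reaches its ceiling of 1. Real tests, such as the past tests described here, fall below that.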
The attitude items were constructed based on information
gathered from [...] 1978 summer
sessions at Ohio University. Initially, twenty-one students in one
EDRE 501 class were asked to respond to eight open-ended questions
concerning their likes and dislikes towards educational research.
From the responses to these open-ended questions a list of 160
attitude items was compiled. These attitude items were then tried
on a group of 19 students in a second EDRE 501 class, and based on
their responses items were modified or deleted. A revised list of
70 items was then given to twenty-two students in a third EDRE 501
class. The 58 items for the final attitude instrument were selected
because they yielded mean differences between groups who scored high
and low (total attitude score plus or minus [...] standard deviation
above or below the mean) of at least .2 of a standard deviation and
had a correlation with the total test score of at least .25. The
final attitude instrument had 35 positively worded items and 23
negatively worded items.
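
The two selection rules just described, a high-low group difference of at least .2 of a standard deviation and an item-total correlation of at least .25, can be expressed as a short screening function. The sketch below is an interpretation rather than the authors' procedure. It assumes negatively worded items have already been reverse-scored, that the difference is expressed in units of the item's own standard deviation, and that the high and low groups lie one standard deviation from the mean total score (`cutoff_sd`, a placeholder because the value used in the study is not legible in the source). All names and sample data are hypothetical.

```python
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def keep_item(item_scores, total_scores, cutoff_sd=1.0,
              min_diff_sd=0.2, min_r=0.25):
    """Apply both screening rules to one attitude item."""
    m, sd = mean(total_scores), pstdev(total_scores)
    pairs = list(zip(item_scores, total_scores))
    high = [item for item, total in pairs if total >= m + cutoff_sd * sd]
    low = [item for item, total in pairs if total <= m - cutoff_sd * sd]
    item_sd = pstdev(item_scores)
    if not high or not low or item_sd == 0:
        return False
    # Rule 1: high-low mean difference in item standard deviations.
    if (mean(high) - mean(low)) / item_sd < min_diff_sd:
        return False
    # Rule 2: correlation of the item with the total score.
    return pearson(item_scores, total_scores) >= min_r


totals = [10, 20, 30, 40, 50]            # hypothetical total scores
good_item = [1, 2, 3, 4, 5]              # rises with the total
reversed_item = [5, 4, 3, 2, 1]          # falls with the total
print(keep_item(good_item, totals), keep_item(reversed_item, totals))
```

An item that rises with the total score passes both rules, while one that runs against it is rejected at the first rule.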

Classes

A brief description of the students who enrolled in the five
classes in this study is shown in Table 1. The information in Table 1
was arrived at from the background information sheet in Appendix B.
In Table 1 it can be seen that the classes differed considerably
on whether the students were full-time (registered for fifteen or
more hours) or part-time (registered for less than fifteen hours).
Classes 1, 2, and 5 were composed of primarily part-time students;



Table 1

Frequency Counts and Percentages(a) of
Students in Various Categories
Across the Five Classes

                                        Class
Category         1(b)      2         3(b,c)    4(b,c)    5         Overall

Type of Enrollment
  Full-time      4(27)     1(9)      12(60)    11(100)   0(0)      28(43)
  Part-time      11(73)    10(91)    8(40)     0(0)      8(100)    37(57)

Undergraduate Degree in Education
  Yes            10(67)    10(91)    14(70)    8(73)     5(63)     47(72)
  No             5(33)     1(9)      6(30)     3(27)     3(37)     18(25)

Age
  18-25          3(20)     5(45)     10(50)    3(27)     3(38)     24(37)
  26-35          9(60)     5(45)     8(40)     7(64)     4(50)     33(51)
  36-45          2(13)     1(10)     2(10)     1(9)      1(12)     7(11)
  46-55          1(7)      0(0)      0(0)      0(0)      0(0)      1(1)

Sex
  Male           5(33)     1(9)      6(30)     6(55)     1(12)     19(29)
  Female         10(67)    10(91)    14(70)    5(45)     7(88)     46(71)

Class Size       15        11        20        11        8         65

(a) Percentages are in parentheses.
(b) Taught by one instructor.
(c) Taught on campus.



class 3 had eleven (60%) full-time and eight (40%) part-time students;
and class 4 had all full-time students. All of the classes were
composed primarily of students who had received their undergraduate
degree in Education. Across all of the classes the students were
primarily (88%) in the 18-35 age range; however, classes 1, 4 and 5
had slightly older (26-55) students, and classes 2 and 3 had about
half of their students in the younger (18-25) age range. All of the
classes were primarily made up of females, except for class 4, which
was approximately evenly split between males (55%) and females (45%).

Classes 1, 2 and- 5 were offered in the branch campuses, and ^ • 
classes 3 (evening) ^and 4 ^^orning) were offered on t>he main campus. 
From Tabl"^ 1 the main iisti^ction^between the b,^hcV and main campus 
students was that most (68%) of the main campus .stuaents were 
enrolled full-tim^. whil. most (85%) of the off-campus students were 
enrolled part time. One instructor taught classes 1, 3 and 4 . • 

Test Size and Time

As can be seen in Table 1, the classes consisted of different
numbers of students. Following the multiple matrix sampling
procedure (see Appendix A), this meant that the size of the test taken
in each class was dependent on the number of students, since
in all cases the total item domains, both cognitive and attitude, were
used. This meant that in Class 1, with 15 students, each week each
student took a 15 or 16 item achievement test and a 3 or 4 item
attitude test. The approximate size of each test taken each week
from each item domain is shown in Table 2, along with the average
time spent taking each test. As can be seen in Table 2, the average
testing times ranged from 13.6 minutes to 17.7 minutes. Therefore,






Table 2

Approximate Number of Items Taken Each Week
in the Five Classes and the Average
Time Spent in Testing

              Approximate Number of Items (Average Time in Minutes)

                                     Class
              1          2          3          4          5

Achievement   16         22         12         22         30
Attitude      4          5          3          5          7
Total         20(14.3)   27(13.6)   15(13.6)   27(14.7)   37(17.7)




for future testings with item domains similar in size to the ones
used here, and with classes larger than 10 students, one might
plan on a test period of approximately 15 minutes.
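
The dependence of test length on class size can be illustrated with a short sketch. It assumes, as the text implies, that the whole item domain is dealt out as evenly as possible among the students present each week; the function name is hypothetical, and only the 58-item attitude domain is used because the size of the cognitive domain is not stated in this section.

```python
def items_per_student(domain_size, class_size):
    """Split a whole item domain as evenly as possible among the students.

    Returns one test length per student; lengths differ by at most one item.
    """
    base, extra = divmod(domain_size, class_size)
    # `extra` students receive one additional item so the domain is exhausted.
    return [base + 1] * extra + [base] * (class_size - extra)

# Attitude domain of 58 items in Class 1 (15 students).
class_1_attitude = items_per_student(58, 15)
```

For Class 1 this yields tests of 3 or 4 attitude items, in line with the text, and for Class 5 (eight students) tests of 7 or 8 items, close to the approximate value in Table 2.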

In order to allow for absentees it was necessary to construct
a series of 36 test groupings. This series allowed the total item
domains to be tested each week and also controlled so that no
student in Classes 1 through 4 ever took the same item twice. In
Class 5 there were only eight students and 10 testings; therefore,
each person took items they had seen before (but at different times)
during the last two testings. Tests were distributed at random
during the first session, but then students and tests were kept track
of to ensure that no student took the same test twice.
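
Why repeats were unavoidable only in Class 5 can be seen from a simple rotation. The sketch below is an assumption made for illustration, not the authors' 36 test groupings: it ignores absentees, splits the domain into as many disjoint subtests as there are students, and shifts the assignment by one position each week. The function names are hypothetical.

```python
def rotation_schedule(n_students, n_weeks):
    """schedule[week][student] = index of the subtest taken that week.

    The item domain is assumed to be split into n_students disjoint subtests,
    so every subtest (and hence the whole domain) is used each week.
    """
    return [[(s + w) % n_students for s in range(n_students)]
            for w in range(n_weeks)]

def first_repeat_week(schedule):
    """Return the 0-based week in which some student first repeats a subtest."""
    seen = [set() for _ in schedule[0]]
    for week, row in enumerate(schedule):
        for student, subtest in enumerate(row):
            if subtest in seen[student]:
                return week
            seen[student].add(subtest)
    return None

class_5 = rotation_schedule(n_students=8, n_weeks=10)
class_2 = rotation_schedule(n_students=11, n_weeks=10)
```

Under this rotation a class of eight first repeats a subtest in the ninth of ten testings, which matches the statement that Class 5 saw repeated items only during the last two testings, while a class of eleven completes all ten testings without a repeat.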

Approximately 40 hours were spent collecting, classifying, and
collating the items for the cognitive domain. Another eight hours
were spent revising and correcting these items. Approximately six
hours were spent writing the initial 100 attitudinal items, with an
additional seven hours spent revising the attitudinal items and
objectives, and testing these items. Therefore, approximately 61
hours were spent preparing the item domains. The computer time
required to prepare and score the tests each week was 25.4 seconds;
the human time needed each week to pull the tests apart, distribute
them, and have them scored was one hour and 38 minutes.

Analysis of Class Means

The estimates of the class means for the cognitive items are
plotted in Figure 2 and the estimates of the class means for the
attitude items are plotted in Figure 3. The overall multivariate
repeated measures analysis of these class means is shown in Table 3




Figure 3. The Changes in Class Mean on the Attitude Domain
for the Five Classes in This Study



Table 3

Overall Multivariate Repeated Measures Analysis
with All Five Classes

Column headings: Source of Variation; Sums of Squares and Products
(Achievement, Attitude); Multivariate (Df, Wilks' Lambda,
F (Probability)); Univariate (Df, F (Probability) for Achievement
and Attitude)


Table 4

Multivariate Repeated Measures Trend Analysis
Over Time with All Five Classes

                      Multivariate                          Univariate F (Probability)
Source of             Wilks'
Variation     Df      Lambda   F (Probability)      Df      Achievement       Attitude

Linear Trend  2,35    .28      44.02* (.0001)       1       90.52* (.0001)    1.52 (.2262)
Other         16,70   .76      .64 (.8401)                  .58 (.7859)       .83 (.5815)
Error                                               36

*Significant at p < .05



with the trend analysis over time shown in Table 4. The overall
multivariate analysis of the means for the three instructors, using
the unweighted average of the means of the three classes taught by
one instructor, is shown in Table 5 with the trend analysis over
time shown in Table 6.

In Figure 2 the learning curves for the classes are linear
with a positive slope, and there appear to be class differences with
respect to achievement. In Figure 3 the attitude means show no
trend over time and no differences between the classes. These
observations are supported by the results shown in Tables 3 and 4. In
Table 3 a multivariate significant difference is found between the
class means over time (F = 4.13, p < .0001) and most of this
difference is due to achievement (F = 10.56, p < .0001) and not
attitude (F = .91, p < .5298). In Table 4 a multivariate linear
trend over time is indicated (F = 44.02, p < .0001), and the linear
trend is found over the achievement means (F = 90.52, p < .0001) but
not over the attitude means (F = 1.52, p < .2262). The results in
Table 3 also indicate a multivariate significant difference among
classes (F = 8.41, p < .0001) with this difference primarily due to
achievement (F = 22.81, p < .0001) and not attitude (F = 2.33,
p < .0742).
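
As a rough check on the linear trend test, the multivariate F can be recovered from the Wilks' lambda reported in Table 4 (.28, with 2 and 35 degrees of freedom). The sketch below assumes the standard exact conversion for a hypothesis with a single degree of freedom; the function name is hypothetical, and the error degrees of freedom (36) and the two dependent variables are read from Table 4.

```python
def wilks_to_f_single_df(wilks_lambda, n_vars, error_df):
    """Exact F for a one-degree-of-freedom hypothesis tested with Wilks' lambda.

    F = ((1 - L) / L) * ((error_df - n_vars + 1) / n_vars),
    with (n_vars, error_df - n_vars + 1) degrees of freedom.
    """
    df1 = n_vars
    df2 = error_df - n_vars + 1
    f_value = (1 - wilks_lambda) / wilks_lambda * df2 / df1
    return f_value, df1, df2

# Linear trend over time: lambda = .28, achievement and attitude, error df = 36.
f_linear, df1, df2 = wilks_to_f_single_df(0.28, n_vars=2, error_df=36)
```

The result is about 45.0 on 2 and 35 degrees of freedom. The small gap from the reported 44.02 is what one would expect if the tabled lambda was rounded to two decimals (a lambda near .2845 reproduces 44.02).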

The statistical analyses in Tables 5 and 6 are the same as those
shown in Tables 3 and 4 except that the comparisons were made with
respect to instructors and not classes. In these tables the class
means from the classes taught by one instructor were averaged to
represent him. The results are the same as those reported in
Tables 3 and 4 with the exception that the overall multivariate




Table 5

Overall Multivariate Repeated Measures Analysis
With Three Classes(a)

              Sums of Squares
              and Products             Multivariate                        Univariate F (Probability)
Source of                                    Wilks'
Variation     Achievement  Attitude    Df    Lambda  F (Probability)      Df   Achievement      Attitude

Time          .0967        .0441       18    .22     2.16* (.0257)        9    4.71* (.0026)    .7812 (.6363)
              .0441        .0364

Instructors   .1252        .2648       4     .21     10.04* (.0001)       2    27.43* (.0001)   5.83* (.0112)
              .2648        .6034

Error         .0411        .0059       34                                 18
              .0059        .9313

(a) The scores for the three classes taught by the same instructor were averaged to form the third class.
*Significant at p < .05.
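
The univariate F ratios for instructors can be checked against the sums of squares in Table 5. The sketch below assumes that the diagonal entries of each sums of squares and products matrix are the univariate sums of squares (achievement first, attitude second) and that the univariate error term has 18 degrees of freedom; the function name is hypothetical.

```python
def univariate_f(ss_effect, df_effect, ss_error, df_error):
    """F ratio as the effect mean square over the error mean square."""
    return (ss_effect / df_effect) / (ss_error / df_error)

# Instructor effect, df = 2; error df = 18 (values read from Table 5).
f_achievement = univariate_f(0.1252, 2, 0.0411, 18)
f_attitude = univariate_f(0.6034, 2, 0.9313, 18)
```

Under these assumptions the ratios come out at roughly 27.4 and 5.83, which agrees with the reported values of 27.43 and 5.83 to within the rounding of the tabled sums of squares.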



Table 6

Multivariate Repeated Measures Trend Analysis

              Sums of Squares
              and Products             Multivariate                         Univariate F (Probability)
Source of                                      Wilks'
Variation     Achievement  Attitude    Df      Lambda  F (Probability)      Df   Achievement      Attitude

Linear Trend  .0923        .0396       2,17    .31     19.1* (.0001)        1    40.43* (.0923)   .33 (.5735)
              .0396        .0170

Other         .0045        .0046       16,34   .66     .50 (.9324)          8    .24 (.9760)      .84 (.5821)
              .0046        .3468

Error         .0411        .0059                                            18
              .0059        .9313

*Significant at p < .05



significant difference between the instructors (F = 10.04, p < .0001)
appears to have been due to both achievement (F = 27.43, p < .0001)
and attitude (F = 5.83, p < .0112). This result led to the plotting
of the three instructors' classes' attitude means in Figure 4. The
plot of the attitude means in Figure 4 indicates that there are
interactions in the data making the overall test difficult to
interpret. However, Class 2 does have the lowest pattern of attitude
and did finish the class with a lower attitude mean than it started
with. Classes 5 and C had a higher pattern of mean attitudes and
both finished higher than they started.
Instructor Opinions

The information discussed in this section is based on the
structured interview questions found in Appendix C. In response to
questions #1, #2 and #3 concerning worthwhile achievement and attitude
information, the instructors indicated that they found the information
interesting but that they made no use of it. In response to questions
#4 and #5 concerning information to students, the instructors indicated
that while the students showed some interest in the class's progress,
they were primarily interested in their own individual progress. In
response to question #6, two instructors thought that the testing
periods did not require an excessive amount of time, and one instructor
indicated that the testing time added up to one class period, and
that was substantial for a class that meets only once a week. The
instructors' responses to questions #7 and #8, concerning information
gained from this study, indicated that by itself the information was
not of value to them in their teaching, but that if the class
information could be considered with respect to the other classes, or




Figure 4. The Changes in Class Mean on the Attitude Domain
for the Three Instructors' Classes. C is the
Combined Mean (Average) for One Instructor.
(Horizontal axis: time in weeks.)

with respect to normative information, it might serve to motivate
them to improve their teaching.

Discussion

Although a good deal of care was put into implementing this
study of the practicality and feasibility of the B-model, the results
can only be considered as exploratory in considering the model's
full potential. This was the first implementation of this model and
it was done on a small scale with only five classes and three teachers.
However, some of the results, particularly those in the measurement
of the cognitive domain, may be considered as particularly encouraging.
If one reconsiders the distributions of cognitive class mean estimates
over time, shown in Figure 2, it is interesting to note that although
the classes start in different positions on the first testing, by
the second testing the classes establish a pattern that seems to be
reflecting teacher differences. Here the teacher with Classes 1, 3,
and 4 has established the median pattern of achievement; the teacher
in Class 5 the highest pattern of achievement; and the teacher in
Class 2 the lowest pattern of achievement. What is of interest is
that the classes taught by one instructor had mean achievement that
was so homogeneous and yet distinguishable from the achievement of
the other two classes. This pattern is most interesting when one
considers the variety of backgrounds exhibited by Classes
1, 3 and 4 in Table 1. This would be fascinating as an
experiment where subjects were randomly assigned to classrooms. Is
it the teacher in Class 5, the students, or something that this teacher
is doing that is causing the high achievement (or is it simply the
fact that these students took more items)? What is happening in
Classes 1, 2, 3, and 4? Although this data may simply be an artifact
of its small scale, the achievement results certainly encourage its
implementation on a larger scale.

The results of the instructor opinions were not judged to be
encouraging for use of the B-model. But this seemed to be due to
the way the results were presented to each instructor, with no
perspective. That is, the instructors did not know if they were doing
well or not; they had no criteria on which to make a judgment as
to their classes' progress. This is an indication that in future use
of the model more effort should be put into providing the instructors
with comparative information, although this may require [...]
the same instructors over several years. (What might happen to
class achievement if an experimental study were conducted where
instructors were shown learning curves with their classes above or
below the norm?)

The measurement of student attitude towards educational
research failed to indicate any reliable teacher differences or
differences over time. The results in Figures 3 and 4 indicate that, at
least on the instrument constructed [...] teachers had high
positive attitudes towards educational research, and that these
attitudes did not appear to be strongly affected by their teachers.



Conclusion

In their recent book on school effectiveness, Madaus, Airasian,
and Kellaghan (1980) indicate that in [...] studies investigators
have traditionally used inappropriate measures (i.e., standardized
tests) to study the effects of schooling, and that these
inappropriate measures are partially responsible for the
results shown in studies (e.g., Cicirelli et al., 1969;
Coleman et al., 1970; Jensen, [...]) that schooling does not have a
strong effect on student achievement
beyond that accounted for by social class and home
background. The results presented here indicate that the B-model
may provide researchers and school administrators with a sensitive
measurement approach, which is economical in terms of teacher and
student time, with which to measure pupil progress, and perhaps
teacher effectiveness, throughout the school year.







References

Ausubel, D. P. Educational psychology: A cognitive view. New York:
Holt, Rinehart and Winston, Inc., 1968.

Barcikowski, R. S., & Patterson, J. L. A computer program for
randomly selecting test items from an item population.
Educational and Psychological Measurement, 1972, 32, 795-798.

Biehler, R. F. Psychology applied to teaching. Boston: Houghton-
Mifflin Co., 1971.

Blair, G., Jones, R., & Simpson, R. Educational psychology. New
York: MacMillan Co., 1967.

Cicirelli, V. G., et al. The impact of Head Start: An evaluation of
the effects of Head Start on children's cognitive and affective
development. Study by Westinghouse Learning Corporation and
Ohio University. Washington, D.C.: Office of Economic
Opportunities, 1969.

Coleman, J. S., Campbell, E. Q., Hobson, C. J., McPartland, J., Mood,
A. M., Weinfeld, F. D., & York, R. L. Equality of educational
opportunity. Washington, D.C.: Office of Education, U.S.
Department of Health, Education and Welfare.

Cronbach, L. Course improvement through evaluation. Teachers College
Record, 1963, 64, 672-683.

Ebel, R. L. Essentials of educational measurement. Englewood Cliffs,
New Jersey: Prentice-Hall, Inc., 1972.

Ebel, R. Declining scores: A conservative explanation. Phi Delta
Kappan, 1976, 58, 306-310.

French, J. W. Aptitude and interest score patterns related to
satisfaction with college major field. Educational and Psychological
Measurement, 1961, 21, 2, 287-294.

Gallup, G. Ninth annual Gallup poll of public attitudes toward
education. Phi Delta Kappan, 1977, 59, 33-48.

Glass, G. V., et al. Data analysis of the 1968-69 survey of compensatory
education, Title I. Final Report No. OEG8-8-961860 4003-(058).
Washington, D.C.: U.S. Office of Education, 1970.

Hilgard, E. R. Theories of learning. 2nd ed. New York: Appleton-
Century-Crofts, Inc., 1956.

Hilgard, E. R., & Bower, G. H. Theories of learning. 3rd ed. New
York: Appleton-Century-Crofts, Inc., 1966.

Johnson, D. W. Affective outcomes. In H. J. Walberg (Ed.) Evaluating
educational performance. Berkeley, California: McCutchan Press,
1974, 99-112.

Krathwohl, D. R., Bloom, B. S., and Masia, B. B. Taxonomy of
educational objectives, Handbook II: Affective domain. New York:
David McKay Co., Inc., 1964.

Leinhardt, G. Program evaluation: An empirical study of individualized
instruction. American Educational Research Journal, 1977, 14,
277-293.

Madaus, G. F., Airasian, P. W., & Kellaghan, T. School effectiveness.
New York: McGraw-Hill, 1980.

Porter, J. The virtues of a state assessment program. Phi Delta
Kappan, 1976, 57, 667-668.

Shoemaker, D. M. Principles and procedures of multiple matrix sampling.
Cambridge, Massachusetts: Ballinger, 1973.

Shoemaker, D. M. Toward a framework for achievement testing. Review
of Educational Research, 1975, 45, 127-147.

Shoemaker, D. M. The contribution of multiple matrix sampling to
evaluating teacher effectiveness. In Borich, G. D. (ed.) The
appraisal of teaching: Concepts and process. Reading, Massachusetts:
Addison-Wesley Publishing Co., 1977, 292-300.

Simonson, M. R. Attitude change and achievement: Dissonance theory
in education. Journal of Educational Research, 1976, 21, 163-169.

Sirotnik, K. Introduction to matrix sampling for the practitioner.
In Popham, W. J. (ed.) Evaluation in education. Berkeley, Calif.:
McCutchan, 1974. (Also available as a separate paperback, same
publisher, same year.)

Worthen, B., & Sanders, J. Educational evaluation: Theory and
practice. Worthington, Ohio: Charles A. Jones, 1973.






Reference notes

Barcikowski, R., & Upp, C. A model system for the evaluation of
teacher effectiveness. Unpublished manuscript, 1978. (Available
from Robert Barcikowski, Ohio University, Athens, Ohio, 45701.)






APPENDIX A 



MULTIPLE MATRIX SAMPLING 






Multiple matrix sampling is a method of collecting group data
with the expenditure of a very small amount of time and money as
compared to the traditional census method of collecting data. In
the census method of testing, all items are administered to all
students. For example, 25 arithmetic items might be administered to
a class of 30 pupils, with each pupil being tested on all 25 items.
Individual achievement data may be collected in this manner and group
statistics may be derived from the individual data. Note that 750
items (25 items x 30 pupils) would need to be scored for the example
given. If individual data are not needed, multiple matrix sampling
can greatly reduce the number of items to be scored, thus providing
economy of pupil and teacher time.

To apply a procedure of multiple matrix sampling, the group of
test items is divided into subtests, and the group of examinees
is divided into subgroups of examinees. This is done by a procedure
of randomization in both cases. For a small number of items and
examinees, as given in the example above, this can be done by using a
table of random digits. If the decision has been made to reduce
testing time to one-fifth, items and examinees are randomly divided
into five groups. Students are assigned sequential numbers beginning
with 1. Test items are assigned sequential numbers beginning with 1.
The table of random digits is then used to select the items for each
subgroup. For the example above, the random arrangement would give
five subtests, which might contain the following items:

Subtest 1 - items 2, 8, 16, 5, 17
Subtest 2 - items 4, 14, 7, 20, 15
Subtest 3 - items 10, 19, 25, 6, 18
Subtest 4 - items 12, 22, 1, 13, 3
Subtest 5 - items 9, 11, 21, 23, 24

A random arrangement of 30 students into five groups might produce
the following arrangement:

Subgroup 1 - students 7, 18, 28, 4, 5, 20
Subgroup 2 - students 3, 22, 27, 2, 17, 21
Subgroup 3 - students 29, 13, 11, 14, 23, 26
Subgroup 4 - students 6, 19, 15, 30, 24, 9
Subgroup 5 - students 1, 8, 10, 12, 16, 25
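
The random division of items and students described above can be sketched as follows. The sketch swaps the table of random digits for a seeded pseudorandom shuffle, which is an assumption made for convenience; the function name is hypothetical, and a different seed would give a different but equally valid arrangement.

```python
import random

def random_partition(n_units, n_groups, rng):
    """Randomly divide units numbered 1..n_units into n_groups groups.

    Group sizes differ by at most one; every unit lands in exactly one group.
    """
    ids = list(range(1, n_units + 1))
    rng.shuffle(ids)
    # Deal the shuffled ids round-robin into the groups.
    return [ids[g::n_groups] for g in range(n_groups)]

rng = random.Random(1981)  # fixed seed only so the example is repeatable
subtests = random_partition(25, 5, rng)    # five subtests of five items
subgroups = random_partition(30, 5, rng)   # five subgroups of six students
```

Pairing `subtests[k]` with `subgroups[k]` gives each subgroup its own subtest.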

Subgroup 1 would then be given subtest 1, subgroup 2 would be
given subtest 2, etc. Each subgroup of students is given a different
group of items, a fraction of the size of the original test. A mean
score for each subgroup is computed (the number of items answered
correctly by students in each subgroup divided by the total number
of possible responses for that subgroup). From these means the mean
of the entire group is computed. This mean of the subgroup means is
an unbiased estimate of the true mean of the group and will correspond
very closely with the mean that would be determined by administering
every test item to every student. Note that in the example given only
one-fifth as much time for test administration was required and only
one-fifth as many items (30 students x five items each) need to be
scored. Sirotnik (1974, p. 461) and Shoemaker (1973, p. 5) both
indicate the accuracy of multiple matrix sampling as an estimator
of group means and both give excellent examples.
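
The estimate described above, the mean of the subgroup means, can be sketched on simulated data. Everything in the simulation is an assumption made for illustration (right/wrong scoring, a 70% chance of a correct answer, the seed, and the function names); only the 30 students, 25 items, and five groups come from the example in the text.

```python
import random

def split_into_groups(n_units, n_groups, rng):
    """Randomly deal 0-based unit indices into n_groups groups."""
    ids = list(range(n_units))
    rng.shuffle(ids)
    return [ids[g::n_groups] for g in range(n_groups)]

def census_mean(scores):
    """Proportion correct when every student answers every item."""
    return sum(map(sum, scores)) / (len(scores) * len(scores[0]))

def matrix_sample_mean(scores, student_groups, item_groups):
    """Mean of subgroup means; subgroup k answers only subtest k."""
    subgroup_means = []
    for students, items in zip(student_groups, item_groups):
        correct = sum(scores[s][i] for s in students for i in items)
        subgroup_means.append(correct / (len(students) * len(items)))
    return sum(subgroup_means) / len(subgroup_means)

rng = random.Random(1981)
# Simulated right (1) / wrong (0) answers for 30 students on 25 items.
scores = [[1 if rng.random() < 0.7 else 0 for _ in range(25)] for _ in range(30)]
student_groups = split_into_groups(30, 5, rng)
item_groups = split_into_groups(25, 5, rng)

estimate = matrix_sample_mean(scores, student_groups, item_groups)
full = census_mean(scores)
# Responses actually scored under matrix sampling: 30 students x 5 items.
scored = sum(len(s) * len(i) for s, i in zip(student_groups, item_groups))
```

Here `scored` is 150 rather than the 750 responses of the census method, and `estimate` can be compared with `full` over repeated random arrangements to see how closely the two agree.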

When the aim of testing is to measure the degree of accomplishment
of the objectives for a complete course of study, the test item
pool might easily consist of several hundred or even two or three
thousand test questions. This would, of course, necessitate the
assignment of more than five or six items to each subgroup of pupils. The
exact number of items to be assigned is determined by the size of
the item pool, the number of students to be tested and the degree of
accuracy desired. If the item pool contains fewer than 500 test
items, every question should be answered by at least one subgroup
of pupils (Sirotnik, 1974, p. 467). If the item population is larger
than 500, it can be considered as of infinite size and sampled randomly
to obtain subtests of appropriate size. A more complete discussion of
"appropriate size" may be found in Sirotnik (1974) or Shoemaker (1973).



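The pool-size rule at the end of Appendix A can be sketched as a single function. This is an illustration under stated assumptions, not the procedure of Sirotnik or Shoemaker: the function name and parameters are hypothetical, a pool of exactly 500 items is treated as large, and for large pools each subtest is drawn independently, so an item may appear in more than one subtest.

```python
import random

def build_subtests(pool_size, n_subgroups, items_per_subtest, rng, large_pool=500):
    """Build one subtest (list of 0-based item indices) per subgroup."""
    items = list(range(pool_size))
    if pool_size < large_pool:
        # Small pool: deal out every item so each one is answered by a subgroup.
        # items_per_subtest is not used here; the pool size fixes the lengths.
        rng.shuffle(items)
        return [items[g::n_subgroups] for g in range(n_subgroups)]
    # Large pool: treat it as effectively infinite and sample each subtest.
    return [rng.sample(items, items_per_subtest) for _ in range(n_subgroups)]

rng = random.Random(1981)
small_pool_subtests = build_subtests(240, 8, 30, rng)    # pool size is hypothetical
large_pool_subtests = build_subtests(2000, 10, 25, rng)  # pool size is hypothetical
```

In the small-pool case the subtest length follows from the pool size and the number of subgroups; in the large-pool case it is chosen by the user according to the accuracy desired.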



APPENDIX B
DEMOGRAPHIC DATA



STRUCTURED INTERVIEW

1. Did you receive any worthwhile information on the achievement
of your class during the progress of the study?

2. Did you receive any helpful information on the attitudes of
your class during the course of the study?

3. Would you have received this information without the study? If
so, how? In what form?

4. Do you feel that your students were interested in the shape of
the learning curve as it developed?

5. Do you think that knowledge of the class progress was beneficial
or harmful to the class?

6. Did the testing periods require an excessive amount of class time?

7. Did the information you received assist you in understanding the
progress of your class?

8. Would you like to continue to receive this kind of information
about your classes?




APPENDIX C
STRUCTURED INTERVIEW FORM



BACKGROUND INFORMATION

To help in the analysis of data for the study you are participating
in, some background information is required. Please answer the
questions below. You need not sign your name.

1. Full or part-time student? ____

2. If employed, where? ____

3. What position do you hold? ____

4. What position do you hope to hold after you complete your studies? ____

5. If you are a teacher, how many years have you taught? ____

6. Major field? ____



7. Age: 0-17 



8. Sex: 



